Much of the power of the UNIX™ operating system comes from a style of
program design that makes programs easy to use and, more importantly,
easy to combine with other programs. This style is distinguished by the
use of software tools, and depends more on how the programs fit into the
programming environment (how they can be used with other programs) than
on how they are designed internally. But as the system has become
commercially successful and has spread widely, this style has often been
compromised, to the detriment of all users. Old programs have become
encrusted with dubious features. Newer programs are not always written
with attention to proper separation of function and design for
interconnection. This paper discusses the elements of program design,
showing by example good and bad design, and indicates some possible
trends for the future.


I. INTRODUCTION

The UNIX operating system has become a great commercial success,
and is likely to be the standard operating system for microcomputers
and some mainframes in the coming years.

There are good reasons for this popularity. One is portability: the
operating system kernel and the applications programs are written in
the programming language C, and thus can be moved from one type of
computer to another with much less effort than would be involved in
recreating them in the assembly language of each machine. Essentially,
the same operating system therefore runs on a wide variety of
computers, and users need not learn a new system when new hardware
comes along. Perhaps more important, vendors who sell the UNIX system
need not provide new software for each new machine; instead, their
software can be compiled and run without change on any hardware, which
makes the system commercially attractive. There is also an element of
zealotry: users of the system tend to be enthusiastic and to expect it
wherever they go; the students who used the UNIX system in universities
a few years ago are now in the job market and often demand it as a
condition of employment.

But the UNIX system was popular long before it was even portable,
let alone a commercial success. The reasons for that are more
interesting.

Except for the initial PDP-7* version, the UNIX system was written
for the PDP-11* computer, which was deservedly very popular. The
PDP-11 computers were powerful enough to do real computing, but
small enough to be affordable by small organizations such as academic
departments in universities.

The early UNIX system was smaller but more effective, and technically
more interesting, than competing systems on the same hardware. It
provided a number of innovative applications of computer science,
showing the benefits to be obtained by a judicious blend of theory and
practice. Examples include the yacc parser-generator, the diff file
comparison program, and the pervasive use of regular expressions to
describe string patterns. These led in turn to new programming
languages and interesting software for applications like program
development, document preparation, and circuit design.

Since the system was modest in size, and since essentially everything
was written in C, the software was easy to modify, to customize for
particular applications, or merely to support a view of the world
different from the original. (This ease of change is also a weakness,
of course, as evidenced by the plethora of different versions of the
system.)

Finally, the UNIX system provided a new style of computing, a new
way of thinking of how to attack a problem with a computer. This
style was based on the use of tools: using programs separately or in
combination to get a job done, rather than doing it by hand, by
monolithic self-sufficient subsystems, or by special-purpose, one-time
programs. This has been much discussed in the literature, so we don't
need to repeat it here; see Ref. 1, for example.
CAT (I)

NAME          cat - concatenate and print

SYNOPSIS      cat file1 ...

DESCRIPTION   cat reads each file in sequence and writes it on
              the standard output stream. Thus:

                  cat file

              is about the easiest way to print a file. Also:

                  cat file1 file2 >file3

              is about the easiest way to concatenate files.

              If no input file is given cat reads from the
              standard input file.

FILES

SEE ALSO      pr, cp

DIAGNOSTICS   none; if a file cannot be found it is ignored.

BUGS

OWNER         ken, dmr

Fig. 1. Manual page for cat, UNIX 1st edition, November 1971.

II. AN EXAMPLE: CAT 

The style of use and design of the tools on the system are closely 
related. The style is still evolving, and is the subject of this essay: in 
particular, how the design and use of a program fit together, how the
tools fit into the environment, and how the style influences solutions 
to new problems. The focus of the discussion is a single example, the 
program cat, which concatenates a set of files onto its standard output. 
Cat is simple, both in implementation and in use; it is essential to the 
UNIX system, and it is a good illustration of the kinds of decisions 
that delight both supporters and critics of the system. (Often a single 
property of the system will be taken as an asset or as a fault by 
different audiences; our audience is programmers, because the UNIX 
environment is designed fundamentally for programming.) Even the 
name cat is typical of UNIX program names: it is short, pronounce- 
able, but not conventional English for the job it does. (For an opposing 
viewpoint, see Ref. 2.) Most important, though, cat in its usages and
variations exemplifies UNIX program design style and how it has 
been interpreted by different communities. 

Figure 1 is the manual page for cat from the UNIX 1st edition* 
manual. Evidently, cat copies its input to its output. The input is 
normally taken from a sequence of one or more files, but it can come
from the standard input. The output is the standard output. The
manual suggested two uses, the general file copy:

    cat file1 file2 >file3

and printing a file on the terminal:

    cat file

The general case is certainly what was intended in the design of the
program. Output redirection (provided by the > operator, implemented
by the UNIX shell) makes cat a fine general-purpose file concatenator
and a valuable adjunct for other programs, which can use cat to
process filenames, as in:

    cat file file2 ... | other-program

The fact that cat will also print on the terminal is a special case.
Perhaps surprisingly, in practice it turns out that the special case is
the main use of the program.*
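The three uses just described can be tried side by side. The following is only a sketch: the scratch directory and the file names file1, file2, and file3 are made up for the illustration, and wc stands in for "other-program".

```shell
# Work in a scratch directory so nothing outside it is touched.
dir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$dir/file1"
printf 'gamma\n'       > "$dir/file2"

# General case: concatenate two files into a third via shell redirection.
cat "$dir/file1" "$dir/file2" > "$dir/file3"

# Adjunct to another program: cat gathers the data, wc counts the lines.
lines=$(cat "$dir/file1" "$dir/file2" | wc -l)
echo "$lines"

# Special case: with no redirection the same program prints on the terminal.
cat "$dir/file3"
```

In each case cat itself does the same thing; only the destination of its standard output changes, and that is arranged by the shell.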

The design of cat is typical of most UNIX programs: it implements
one simple but general function that can be used in many different
applications (including many not envisioned by the original author).
Other commands are used for other functions. For example, there are
separate commands for file system tasks like renaming files, deleting
them, or telling how big they are. Other systems instead lump these
into a single "file system" command with an internal structure and
command language of its own. (The PIP file copy program found on
CP/M or RSX-11* operating systems is an example.) That approach
is not necessarily worse or better, but it is certainly against the UNIX
philosophy. Unfortunately, such programs are not completely alien to
the UNIX system: some mail-reading programs and text editors, for
example, are large self-contained "subsystems" that provide their own
complete environments and mesh poorly with the rest of the system.
Most such subsystems, however, are usually imported from or inspired
by programs on other operating systems with markedly different
programming environments.

III. CAT -v

There are some significant advantages to the traditional UNIX
system approach. The most important is that the surrounding
environment (the shell and the programs it can invoke) provides a
uniform access to system facilities. File name argument patterns are
expanded by the shell for all programs, without prearrangement in
each command. The same is true of input and output redirection.

*The use of cat to feed a single input file to a program has to some
degree superseded the shell's < operator, which illustrates that
general-purpose constructs, like cat and pipes, are often more natural
than convenient special-purpose ones.

Pipes are a natural outgrowth of redirection. Rather than decorate
each command with options for all relevant pre- and post-processing,
each program expects as input, and produces as output, concise and
header-free textual data that connect well with other programs to do
the rest of the task at hand. It takes some programming discipline to
build a program that works well in this environment (primarily, to
avoid the temptation to add features that conflict with or duplicate
services provided by other commands) but it's well worthwhile.

Growth is easy when the functions are well separated. For example,
the 7th edition shell was augmented with a backquote operator that
converts the output of one program into the arguments to another, as
in

    cat `cat filelist`

No changes were made in any other program when this operator was
invented; because the backquote is interpreted by the shell, all
programs called by the shell acquire the feature transparently and
uniformly. If special characters like backquotes were instead
interpreted, even by calling a standard subroutine, by each program
that found the feature appropriate, every program would require at
least recompilation whenever someone had a new idea. Not only would
uniformity be hard to enforce, but experimentation would be harder
because of the effort of installing any changes.
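A small experiment shows the backquote operator at work. This is a sketch under stated assumptions: the files part1 and part2 and the scratch directory are invented for the example, and the names in filelist are assumed to contain no spaces, since the shell splits the substituted output into arguments at white space.

```shell
dir=$(mktemp -d)
printf 'one\n' > "$dir/part1"
printf 'two\n' > "$dir/part2"

# filelist holds the names of the files to process, one per line.
printf '%s\n' "$dir/part1" "$dir/part2" > "$dir/filelist"

# The shell runs the inner cat first and substitutes its output, so the
# outer cat is invoked with the two file names as ordinary arguments.
combined=$(cat `cat "$dir/filelist"`)
printf '%s\n' "$combined"
```

The outer cat needs no knowledge of backquotes; by the time it runs, it sees only a list of file names.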

The UNIX 7th edition system introduced two changes in cat. First,
files that could not be read, either because of denied permissions or
simple nonexistence, were reported rather than ignored. Second, and
less desirable, was the addition of a single optional argument -u, which
forced cat to unbuffer its output (the reasons for this option, which
has disappeared again in the 8th edition of the system, are technical
and irrelevant here).

But the existence of one argument was enough to suggest more, and 
other versions of the system soon embellished cat with features. This 
list comes from cat on the Berkeley distribution of the UNIX system: 

    -s    Strip multiple blank lines to a single instance.
    -n    Number the output lines.
    -b    Number only the nonblank lines.
    -v    Make nonprinting characters visible.
    -ve   Mark ends of lines.
    -vt   Change representation of tab.

In System V, there are similar options and even a clash of naming:
-s instructs cat to be silent about nonexistent files. But none of these
options is an appropriate addition to cat; the reasons get to the heart
of how UNIX programs are designed and why they work well together.

It's easy to dispose of (Berkeley) -s, -n, and -b: all of these jobs are
readily done with existing tools like sed and awk. For example, to
number lines, this awk invocation suffices:

    awk '{ print NR "\t" $0 }' filenames

If line numbering is needed often, this command can be packaged
under a name like linenumber and put in a convenient public place.
Another possibility is to modify the pr command, whose job is to
format text such as program source for output on a line printer.
Numbering lines is an appropriate feature in pr; in fact UNIX System
V pr has a -n option to do so. There never was a need to modify cat;
these options are gratuitous tinkering.
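Packaging the awk command might look like the following sketch. It is written here as a shell function so that it can be tried in place; installed as a command, the body would be the single awk line in an executable file named linenumber. The sample input is made up.

```shell
# linenumber: number the lines of the named files, or of the standard
# input when no files are given, using the awk one-liner from the text.
linenumber() {
    awk '{ print NR "\t" $0 }' "$@"
}

numbered=$(printf 'first\nsecond\n' | linenumber)
printf '%s\n' "$numbered"
```

Because the numbering lives in its own command, it can be applied to the output of any program, not only to files that cat happens to be printing.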

But what about -v? That prints nonprinting characters in a visible
representation. Making strange characters visible is a genuinely new
function, for which no existing program is suitable. ("sed -n l", the
closest standard possibility, aborts when given very long input lines,
which are more likely to occur in files containing nonprinting
characters.) So isn't it appropriate to add the -v option to cat to make
strange characters visible when a file is printed?

The answer is "No." Such a modification confuses what cat's job is
(concatenating files) with what it happens to do in a common special
case, showing a file on the terminal. A UNIX program should do one
thing well, and leave unrelated tasks to other programs. Cat's job is to
collect the data in files. Programs that collect data shouldn't change
the data; cat therefore shouldn't transform its input.

The preferred approach in this case is a separate program that deals
with nonprintable characters. We called ours vis (a suggestive,
pronounceable, non-English name) because its job is to make things
visible. As usual, the default is to do what most users will want (make
strange characters visible) and as necessary include options for
variations on that theme. By making vis a separate program, related
useful functions are easy to provide. For example, the option -s strips
out (i.e., discards) strange characters, which is handy for dealing with
files from other operating systems. Other options control the treatment
and format of characters like tabs and backspaces that may or may
not be considered strange in different situations. Such options make
sense in vis because its focus is entirely on the treatment of such
characters. In cat, they require an entire sublanguage within the -v
option, and thus get even further away from the fundamental purpose
of that program. Also, providing the function in a separate program
makes convenient options such as -s easier to invent, because it
isolates the problem as well as the solution.
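To give a feel for how little a separate filter needs, here is a rough approximation of the stripping behavior described for vis -s, built from tr. It is not the authors' vis: the function name strip_strange is hypothetical, and the choice to keep tab and newline while deleting every other non-printing byte is an assumption made for the example.

```shell
# strip_strange: hypothetical stand-in for the "vis -s" behavior.
# Keeps tab (011), newline (012), and printable ASCII (040-176);
# every other byte is deleted.
strip_strange() {
    LC_ALL=C tr -cd '\011\012\040-\176'
}

# The sample input contains a control-A and an escape character.
cleaned=$(printf 'ab\001c\033d\n' | strip_strange)
printf '%s\n' "$cleaned"
```

Whether tabs or backspaces count as strange is exactly the kind of variation the text assigns to options of the separate program rather than to cat.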

One possible objection to separate programs for each task is
efficiency. For example, if we want numbered lines and visible
characters, it is probably more efficient to run the one command

    cat -n -v file

than the two-element pipeline

    linenumber file | vis

In practice, however, cat is usually used with no options, so it makes
sense to have the common cases be the efficient ones. The current
research version of the cat command is actually about five times
faster than the Berkeley and System V versions because it can process
data in large blocks instead of the byte-at-a-time processing that might
be required if an option is enabled. Also, and this is perhaps more
important, it is hard to imagine any of these examples being the
bottleneck of a production program. Most of the real time is probably
taken waiting for the user's terminal to display the characters, or even
for the user to read them.

Separate programs are not always better than wider options; which
is better depends on the problem. Whenever one needs a way to
perform a new function, one faces the choice of whether to add a new
option or write a new program (assuming that none of the programmable
tools will do the job conveniently). The guiding principle for
making the choice should be that each program does one thing. Options
are appropriately added to a program that already has the right
functionality. If there is no such program, then a new program is
called for. In that case, the usual criteria for program design should
be used: the program should be as general as possible, its default
behavior should match the most common usage, and it should cooperate
with other programs.


IV. FAST TERMINAL LINES 

Let's look at these issues in the context of another problem, dealing
with fast terminal lines. The first versions of the UNIX system were
written in the days when 150 baud was "fast" and all terminals used
paper. Today, 9600 baud is typical, and hard-copy terminals are rare.
How should we deal with the fact that output from programs like cat
scrolls off the top of the screen faster than one can read it?

There are two obvious approaches. One is to tell each program about
the properties of terminals, so it does the right thing (whether by
option or automatically). The other is to write a command that handles
terminals, and leave most programs untouched.

An example of the first approach is Berkeley's version of the ls
command, which lists the file names in a directory. Let us call it lsc
to avoid confusion. The 7th edition ls command lists file names in a
single column, so for a large directory, the list of file names disappears
off the top of the screen at great speed. The lsc command prints in
columns across the screen (which is assumed to be 80 columns wide),
so there are typically four to eight times as many names on each line,
and thus the output usually fits on one screen. The option -1 can be
used to get the old single-column behavior.

Surprisingly, lsc operates differently if its output is a file or pipe:

lsc 

produces output different from 
lsc | cat 

The reason is that lsc begins by examining whether its output is a 
terminal, and prints in columns only if it is. By retaining single- 
column output to files or pipes, lsc ensures compatibility with pro- 
grams like grep or wc, which expect things to be printed one per line. 
This ad hoc adjustment of the output format depending on the desti- 
nation is not only distasteful, it is unique— no standard system com- 
mand has this property. 
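The kind of check that lsc makes can be reproduced in the shell with the standard test operator -t, which reports whether a file descriptor refers to a terminal. The sketch below is not lsc's code; describe_output is a hypothetical name, and it merely reports which case applies.

```shell
# describe_output: say whether standard output (descriptor 1) is a
# terminal, the same question lsc asks before choosing a format.
describe_output() {
    if [ -t 1 ]; then
        echo "terminal"
    else
        echo "file or pipe"
    fi
}

# Called directly, the answer depends on where the script's output goes.
describe_output

# Captured by command substitution, the output is a pipe, not a terminal.
piped=$(describe_output)
echo "$piped"
```

The same command thus produces different results depending on its destination, which is the behavior the text objects to.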

A more insidious problem with lsc is that the columnation facility, 
which is actually a useful, general function, is built in and thus 
inaccessible to other programs that could use a similar compression. 
Programs should not attempt special solutions to general problems. 
The automatic columnation in lsc is reminiscent of the "wild cards"
found in some systems that provide file name pattern matching only
for a particular program. The experience with centralized processing 
of wild cards in the system shell shows overwhelmingly how important 
it is to centralize the function where it can be used by all programs. 
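That pattern matching is done by the shell, and not by each program, is easy to confirm. In this sketch count_args is a hypothetical stand-in for any program: it only reports how many arguments it was given. The file names are invented for the example.

```shell
# count_args: print the number of arguments received.
count_args() { echo "$#"; }

dir=$(mktemp -d)
touch "$dir/a.c" "$dir/b.c" "$dir/notes.txt"

# The shell expands the pattern before count_args runs: two arguments.
expanded=$(count_args "$dir"/*.c)

# Quoting suppresses expansion: the pattern arrives as one literal argument.
literal=$(count_args "$dir/*.c")

echo "$expanded $literal"
```

Nothing in count_args deals with patterns, yet it gains the facility, as does every other command the shell invokes.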

One solution for the ls problem is obvious: a separate program for
columnation, so that columnation into, say, five columns is just

    ls | 5

It is easy to build a first-draft version with the multicolumn option of
pr. The commands 2, 3, etc., are all links to a single file:

    pr -$0 -t -l1 $*

$0 is the program name (2, 3, etc.), so -$0 becomes -n, where n is
the number of columns that pr is to produce. The other options
suppress the normal heading, set the page length to one line, and pass
the arguments on to pr. This implementation is typical of the use of
tools: it takes only a moment to write, and it serves perfectly well for
most applications. If a more general service is desired, such as
automatically selecting the number of columns for optimal compaction, a
C program is probably required, but the one-line implementation above
satisfies the immediate need and provides a base for experimentation
with the design of a fancier program, should one become necessary.
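The one-line implementation can be tried without creating the links by passing the column count as an argument instead of taking it from the command name. This is a sketch, not the authors' file: the function name columns is hypothetical, and it assumes a pr that accepts the -N, -t, and -l options used in the text.

```shell
# columns N [file ...]: arrange input into N columns, one row per "page".
# The original takes N from the command name ($0); here it is the first
# argument so the idea can be tested as a function.
columns() {
    n=$1
    shift
    pr -"$n" -t -l1 "$@"
}

# Seven names in five columns should need two rows.
rows=$(printf '%s\n' a b c d e f g | columns 5 | wc -l)
echo "$rows"
```

With the links in place, the same effect is obtained by typing the column count as the command, as in the ls example in the text.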

Similar reasoning suggests a solution for the general problem of 
data flowing off screens (columnated or not): a separate program to 
take any input and print it a screen at a time. Such programs are by 
now widely available, under names like pg and more. This solution 
affects no other programs, but can be used with all of them. As usual, 
once the basic feature is right, the program can be enhanced with 
options for specifying screen size, backing up, searching for patterns, 
and anything else that proves useful within that basic job. 

There is still a problem, of course. If the user forgets to pipe output 
into pg, the output that goes off the top of the screen is gone. It would 
be desirable if the facilities of pg were always present without having 
to be requested explicitly. 

There are related useful functions that are typically only available 
as part of a particular program, not in a central service. One example 
is the history mechanism provided by some versions of the UNIX 
shell: commands are remembered, so it’s possible to review and repeat 
them, perhaps with editing. But why should this facility be restricted 
to the shell? (It’s not even general enough to pass input to programs 
called by the shell; it applies to shell commands only.) Certainly other 
programs could profit as well; any interactive program could benefit 
from the ability to re-execute commands. More subtly, why should the 
facility be restricted to program input? Pipes have shown that the
output from one program is often useful as input to another. With a
little editing, the output of commands such as ls or make can be
turned into commands or data for other programs. 
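As an illustration of turning output into commands, the following sketch edits the output of ls into a series of mv commands and then hands them to a shell. The file names and the .bak suffix are made up, and the names are assumed to contain no spaces or shell metacharacters.

```shell
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt"

# Edit each file name into a command; & in the sed replacement stands
# for the matched name.
commands=$(cd "$dir" && ls | sed 's/.*/mv & &.bak/')
printf '%s\n' "$commands"

# After inspecting (or further editing) the commands, run them.
(cd "$dir" && printf '%s\n' "$commands" | sh)
ls "$dir"
```

Here the editing is done by sed; the point in the text is that the same review-and-edit step could be offered uniformly, for any program's output, by the environment.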

Another facility that could be usefully centralized is typified by the 
editor escape in some mail commands. It is possible to pick up part of 
a mail message, edit it, and then include it in a reply. But this is all 
done by special facilities within the mail command and so its use is 
restricted. 

Each such service is provided by a different program, which usually 
has its own syntax and semantics. This is in contrast to features such 
as pagination, which is always the same because it is only done by one 
program. The editing of input and output text is more environmental 
than functional; it is more like the shell’s expansion of file name 
metacharacters than automatic numbering of lines of text. But since 
the shell does not see the characters sent as input to the programs, it 
cannot provide such editing. The emacs editor provides a limited form 
of this capability, by processing all system command input and output,
but this is expensive, clumsy, and subjects the users to the complexities
and vagaries of yet another massive subsystem (which isn't to criticize
the inventiveness of the idea).

A potentially simpler solution is to let the terminal or terminal
interface do the work, with controlled scrolling, editing and
retransmission of visible text, and review of what has gone before. We
have used the programmability of the Blit terminal (a programmable
bitmap graphics display) to capitalize on this possibility, to good
effect.

The Blit uses a mouse to point to characters on the display, which 
can be edited, rearranged, and transmitted back to the UNIX system 
as though they had been typed on the keyboard. Because the terminal 
is essentially simulating typed input, the programs are oblivious to 
how the text was created; all the features discussed above are provided 
by the general editing capabilities of the terminal, with no changes to 
the UNIX programs. 

There are some obvious direct advantages to the Blit's ability to
process text under the user's control. Shell history is trivial: commands
can be selected with the mouse, edited if desired, and retransmitted.
Since from the terminal's viewpoint all text on the display is
equivalent, history is limited neither to the shell nor to command input.
Because the Blit provides editing, most of the interactive features of 
programs like mail are unnecessary; they are done easily, transpar- 
ently, and uniformly by the terminal. 

The most interesting facet of this work, however, is the way it 
removes the need for interactive features in programs; instead, the 
Blit is the place where interaction is provided, much as the shell is the
program that interprets file name matching metacharacters. Unfor- 
tunately, of course, programming the terminal demands access to a 
part of the environment that is off limits to most programmers, but 
the solution meshes well with the environment and is appealing in its 
simplicity. If the terminal cannot be modified to provide the capabil- 
ities, a user-level program or perhaps the UNIX system kernel itself 
could be modified fairly easily to do roughly what the Blit does, with 
similar results. 

V. CONCLUSIONS 

The key to problem solving on the UNIX system is to identify the 
right primitive operations and to put them at the right place. UNIX 
programs tend to solve general problems rather than special cases. In 
a very loose sense, the programs are orthogonal, spanning the space 
of jobs to be done (although with a fair amount of overlap for reasons 
of history, convenience, or efficiency). Functions are placed where 
they will do the most good: there shouldn't be a pager in every program
that produces output any more than there should be file name pattern
matching in every program that uses file names.

One thing that the UNIX system does not need is more features. It 
is successful in part because it has a small number of good ideas that 
work well together. Merely adding features does not make it easier for 
users to do things— it just makes the manual thicker. The right 
solution in the right place is always more effective than haphazard 
hacking.